
20 research outputs found

An efficient MPI/OpenMP parallelization of the Hartree-Fock method for the second generation of Intel Xeon Phi processor

Author: Alexeev Yuri
D'mello Michael
Gordon Mark S.
Keipert Kristopher
Mironov Vladimir
Moskovsky Alexander
Publication venue: Association for Computing Machinery (ACM)
Publication date: 14/08/2017

Modern OpenMP threading techniques are used to convert the MPI-only Hartree-Fock code in the GAMESS program to a hybrid MPI/OpenMP algorithm. Two separate implementations are considered, which differ in whether key data structures (the density and Fock matrices) are shared or replicated among threads. All implementations are benchmarked on a supercomputer of 3,000 Intel Xeon Phi processors. With 64 cores per processor, scaling numbers are reported on up to 192,000 cores. The hybrid MPI/OpenMP implementation reduces the memory footprint by approximately 200 times compared to the legacy code. The MPI/OpenMP code was shown to run up to six times faster than the original for a range of molecular system sizes. Comment: SC17 conference paper, 12 pages, 7 figures.

arXiv.org e-Print Archive

Crossref

Developing a Computational Chemistry Framework for the Exascale Era

Author: Bertoni Colleen
Boschen Jeffrey S.
de Jong Wibe A.
Harrison Robert J.
Keipert Kristopher
Pritchard Benjamin
Richard Ryan M.
Valeev Edward F.
Windus Theresa L.
Publication venue: Iowa State University Digital Repository
Publication date: 06/12/2018

Within computational chemistry, the NWChem package has arguably been the de facto standard for running high-accuracy numerical simulations on the most powerful supercomputers. In order to better address the challenges presented by emerging exascale architectures, the decision has been made to rewrite NWChem. Design of the resulting package, NWChemEx, has been driven by exascale computing; however, significant additional design considerations have arisen from the team's involvement with the Molecular Sciences Software Institute (MolSSI). MolSSI is a National Science Foundation initiative focused on establishing coding and data standards for the computational chemistry community. As a result, NWChemEx is built upon a general computational chemistry framework called the simulation development environment (SDE) that is designed with a focus on extensibility and interoperability. The present manuscript describes the modular approach of the SDE and how it has been used to implement the self-consistent field algorithm within NWChemEx.

Digital Repository @ Iowa State University (ISU)

eScholarship - University of California

Knowledge is power: Quantum chemistry on novel computer architectures

Author: Keipert Kristopher
Publication date: 01/01/2017

In the first chapter of this thesis, a background of fundamental quantum chemistry concepts is provided. Chapter two contains an analysis of the performance and energy efficiency of various modern computer processor architectures while performing computational chemistry calculations. In chapter three, the processor architectural study is expanded to include parallel computational chemistry algorithms executed across multiple-node computer clusters. Chapter four describes a novel computational implementation of the fundamental Hartree-Fock method which significantly reduces computer memory requirements. In chapter five, a case study of quantum chemistry two-electron integral code interoperability is described. The final chapters of this work discuss applications of quantum chemistry. In chapter six, an investigation of the esterification of acetic acid on acid-functionalized silica is presented. In chapter seven, the application of ab initio molecular dynamics to study the photoisomerization and photocyclization of stilbene is discussed. Final concluding remarks are noted in chapter eight.

Digital Repository @ Iowa State University (ISU)

Analyzing the Performance and Accuracy of Lossy Checkpointing on Sub-Iteration of NWChem

Author: Calhoun Jon
Cappello Franck
Di Sheng
Keipert Kristopher
Liang Xin
Reza Tasmia
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/11/2019

Future exascale systems are expected to be characterized by more frequent failures than current petascale systems. This places increased importance on the application minimizing the amount of time wasted due to recomputation when recovering from a checkpoint. Typically, HPC applications checkpoint at iteration boundaries. However, for applications that have a high per-iteration cost, checkpointing inside the iteration limits the amount of recomputation. This paper analyzes the performance and accuracy of using lossy compressed checkpointing in the computational chemistry application NWChem. Our results indicate that lossy compression is an effective tool for reducing the sub-iteration checkpoint size. Moreover, compression error tolerances that yield acceptable deviation in accuracy and iteration count are quantified.

Crossref

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Dynamics Simulations with Spin-Flip Time-Dependent Density Functional Theory: Photoisomerization and Photocyclization Mechanisms of cis-Stilbene in ππ* States

Author: Gordon Mark
Harabuchi Yu
Keipert Kristopher
Taketsugu Tetsuya
Zahariev Federico
Publication date: 01/09/2014

On-the-fly dynamics simulations were carried out using spin-flip time-dependent density functional theory (SF-TDDFT) to examine the photoisomerization and photocyclization mechanisms of cis-stilbene following excitation to the ππ* state. A state tracking method was devised to follow the target state among nearly degenerate electronic states during the dynamics simulations. The steepest descent path from the Franck–Condon structure of cis-stilbene in the ππ* state is shown to reach the S1-minimum of 4a,4b-dihydrophenanthrene (DHP) via a cis-stilbene-like structure (referred to as (S1)cis-min) on a very flat region of the S1-potential energy surface. From the dynamics simulations, the branching ratio of the photoisomerization is calculated as trans:DHP = 35:13, in very good agreement with the experimental data, trans:DHP = 35:10. The discrepancy between the steepest descent pathway and the significant trans-stilbene presence in the branching ratio observed experimentally and herein computationally is clarified from an analysis of geometrical features along the reaction pathway, as well as the low barrier of 0.1 eV for the pathway from (S1)cis-min to the twisted pyramidal structure on the S1-potential energy surface. It is concluded that ππ*-excited cis-stilbene propagates primarily toward the twisted structural region due to dynamic effects, with partial branching to the DHP structural region via the flat-surface region around (S1)cis-min. Reprinted (adapted) with permission from Journal of Physical Chemistry A 118 (2014): 11987, doi:10.1021/jp5072428. Copyright 2014 American Chemical Society.

Digital Repository @ Iowa State University (ISU)

Effect of frequency scaling granularity on energy-saving strategies

Author: Eddy SR
Hsu CH
Kristopher Keipert
Mark S Gordon
Masha Sosonkina
Vaibhav Sundriyal
Publication venue: SAGE Publications

Crossref

Excited state properties of 5-formylcytosine and 5-hydroxymethylcytosine

Author: Clancy S.
Joani Mato
Kristopher Keipert
Mark S. Gordon
Roos B.O.
Schmidt M.W.
Publication venue: Informa UK Limited

Crossref

Energy-Efficient Computational Chemistry: Comparison of x86 and ARM Systems

Author: Gordon Mark
Keipert Kristopher
Mitra Gaurav
Rendell Alistair
Sok Leang Sarom
Sosonkina Masha
Sundriyal Vaibhav
Publication date: 01/10/2015

The computational efficiency and energy-to-solution of several applications using the GAMESS quantum chemistry suite of codes are evaluated for 32-bit and 64-bit ARM-based computers, and compared to an x86 machine. The x86 system completes all benchmark computations more quickly than either ARM system and is the best choice to minimize time to solution. The ARM64 and ARM32 computational performances are similar to each other for Hartree–Fock and density functional theory energy calculations. However, for memory-intensive second-order perturbation theory energy and gradient computations, the lower ARM32 read/write memory bandwidth results in computation times as much as 86% longer than on the ARM64 system. The ARM32 system is more energy efficient than the x86 and ARM64 CPUs for all benchmarked methods, while the ARM64 CPU is more energy efficient than the x86 CPU for some core counts and molecular sizes. Reprinted (adapted) with permission from Journal of Chemical Theory and Computation 11 (2015): 5055, doi:10.1021/acs.jctc.5b00713. Copyright 2015 American Chemical Society.

Digital Repository @ Iowa State University (ISU)